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Abstract 

Estimation of the covariance structure of spatial processes is of fundamental 
importance in spatial statistics. In the literature, several non-parametric and 
semi-parametric methods have been developed to estimate the covariance struc¬ 
ture based on the spectral representation of covariance functions. However, 
they either ignore the high frequency properties of the spectral density, which 
are essential to determine the performance of interpolation procedures such as 
Kriging, or lack of theoretical justification. We propose a new semi-parametric 
method to estimate spectral densities of isotropic spatial processes with irregu¬ 
lar observations. The spectral density function at low frequencies is estimated 
using smoothing spline, while a parametric model is used for the spectral density 
at high frequencies, and the parameters are estimated by a method-of-moment 
approach based on empirical variograms at small lags. We derive the asymptotic 
bounds for bias and variance of the proposed estimator. The simulation study 
shows that our method outperforms the existing non-parametric estimator by 
several performance criteria. 
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1. Introduction 


In geostatistics, covariance function is the most common tool modelers use 
to describe the spatial dependence structure in the data, and it is a crucial 
ingredient in kriging prediction [1]. The covariance function has to be positive 
definite in order to ensure that the variance of any linear combinations of values 
of the process at various locations is positive: 

n n 

'y y <iiajC{si — Sj) > 0, 

i=l j=l 

for any n real numbers {oi,..., a„}, and spatial locations {si,..., s„} C 
where d is the dimension of the spatial domain. A common solution is to use a 
parametric family of covariance functions that are positive definite. Weighted 
least square methods [2] and likelihood-based methods [3, 4] can then be used to 
estimate parameters. However, it is not always clear what the parametric forms 
should be, and model misspecification can lead to bad kriging performance. 

Due to the positive definite constraint, it is difficult to apply non-parametric 
techniques directly to estimate the covariance function in the spatial domain. 
Bochner’s Theorem [5] shows that a function is continuous and positive definite 
if and only if it is the Fourier transform of a positive bounded measure F on 

C'(x) = j exp{i(jjx)F{du}). (1) 

JR'i 

In the case where F has a density /, which is called the spectral density, (1) 
can be rewritten as 

C'(x) = j exp{iojx)f{uj)duj. (2) 

Jr‘‘ 

For example, for isotropic processes (2) is reduced to a one-dimensional integral 

C(r) = 2(‘'"2)/2r(d/2) / {ru)~^‘^~‘^'>/‘^J(d- 2 )/ 2 {ru)f{u)du, (3) 

Jo 

where r(-) is the Gamma function, and J,y(-) is the Bessel function of the first 
kind of order ly [6]. In the spectral domain the positive definite constraint 
translates to a non-negative constraint on the spectral density which is much 
easier to work with. 
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As a result, many estimation methods for the covariance function have been 
proposed based on its spectral representation. In the time series literature, much 
of the analysis of the spectral representation focus on smoothing periodograms, 
which can be constructed easily for observations on grids. See [7, 8, 9, 10, 11, 
12]. Many of these approaches can be generalized to apply to spatial data on 
grids. Non-parametric modeling of the covariance function and its spectrum 
for irregularly spaced data include [13, 14, 15, 16, 17, 18]. However, these 
methods do not properly take the tail property of the spectral density function 
into consideration. For example, the nonparametric estimator /(w) of Huang 
et al. (2011b) can only take value on a bounded interval [OjWc] for some cutoff 
value Wc, and /(w) = 0 for w > lOc- Thus the estimated covariance function is a 
finite-range integral 

r^c 

C{h) = 2 cos{hu})f{uj)duj, 

Jo 

which leads to {cf"^C{h)/dh^"^}\h=o exists and is finite for any m > 0. A 
random process A'(s) with such a covariance function is infinitely smooth. Stein 
(1999, pg. 30) argues that such smoothness is unrealistic for physical processes 
under normal circumstances. The resulting nonparametric estimator of the 
covariance function can be problematic in kriging. 

Im et al. (2007) proposed a flexible family of models for the spectral density 
function that is a linear combination of cubic splines up to a cutoff frequency Wc 
and an algebraically decaying tail from lOc to inhnity. They used a likelihood- 
based method to estimate the cutoff value and the decay rate assuming the 
process is a Gaussian random field. Simulation studies indicate that their es¬ 
timator can perform well empirically. Two limitations of their paper are the 
following: First, no formal theoretical justification for their method has been 
developed to date. Second, the estimation method is computationally demand¬ 
ing and can not scale to large data sets. 

Following Im et al. (2007), we consider a similar semi-parametric method 
for estimating spectral density of an isotropic Gaussian random process which 
addresses both issues. In our proposed method, the spectral density function 
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is modeled by smoothing splines for low frequencies up to a cutoff frequency, 
which enjoys flexible functional forms, and an algebraic tail for high frequencies. 
The estimator of the spectral density function at low frequencies can be solved 
by a regularized inverse problem [17]. To estimate the delay rate in the alge¬ 
braic tail for high frequencies, we employ a Method-of-Moment approach. Our 
method provides a closed-form solution which allows for theoretical analysis, 
and we derive asymptotic bounds for the bias and variance of the spectral den¬ 
sity estimator. The estimation algorithm is also scalable to large spatial data 
sets. We would like to note that both the theoretical results and the algorithm 
are developed for one-dimensional spatial processes. Generalization to higher 
dimensions will be addressed in a separate paper. 

The rest of the paper is organized as follows. Section 2 presents our method¬ 
ology. In this section, we describe our estimation procedure and provide a closed- 
form solution. Sections 3 contains the asymptotic results. Section 4 presents a 
simulation study. Section 5 concludes. Proofs are provided in the Appendix. 

2. Methodology 

Consider an isotropic Gaussian random process Ar(s) at s = Si, 1 < i < 
N, where {si,...,SAr} C M are irregularly spaced locations. Without loss of 
generality, we assume that ^(s) has mean zero and locations {si,..., sn} satisfy 
some weak regularity conditions to be specified later in Section 3. For example, 
locations following a Poisson process would satisfy those conditions. Following 
Im et al. (2007), we do not posit any parametric form for the spectral density 
function at low frequencies up to a cutoff frequency Wc, and assume an algebraic 
tail for the spectral density at high frequencies: 

fHl) = /(w)/[0,c;,](w) + <t^ 

where 7 is the decay rate. The decay rate of the spectral density function and the 
smoothness parameter of the covariance function are closely related. In Matrn 
class, suppose that v is the Matrn smoothness parameter, then 7 = 2v + d where 
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d is the dimension of space. To derive explicit theoretical results, in this paper 
we only consider random processes in one dimension. The methodology itself is 
more general and can be adapted to stationary processes in higher dimensions. 

We begin by outlining the estimation steps. For estimation of the spectral 
density at low frequencies up to the cutoff value ujc, we follow the approach in 
[17] (HHCll from hereon). We set a grid on the range of observations with 
grid size A = tt/wc, and project the irregularly observed points to their nearest 
grids. We refer to this preprocess step as gridization. Note that the resulting 
gridized data is still different from time series in that some grids may have zero 
observation while some grids may have multiple observations. Thus the classical 
spectral density estimation methods based on the periodograms in time series 
[21, 22, 23, 24] are not suitable. We use the smoothing spline estimation method 
introduced in HHCllb. The estimator is obtained by solving a regularized 
inverse problem. 

The price we pay by projecting irregular data onto grids is that the estimand 
in focus, the spectral density function /a(w) based on the gridized data, is 
different from the true spectral density function /(w), due to aliasing. The 
relationship between /a and / is given by 

OO 

/a(w)= f{u} + 2jujc) (4) 

j=-oo 

for oj e [0,Wc]. The equation (4) allows us to correct the aliasing effect if we 
know the tail of the spectral density. 

For estimation of the spectral density at high frequencies from tUc to oo, we 
focus on estimating the decay rate 7. As mentioned before, the decay rate 7 and 
the smoothness parameter of the variogram function 7(h) are closely related. 
Using Taylor expansion, we have 

7(/i) = d/i]""+0(l/rl“'>+“i) , (5) 

where ao G (0,2), and ai >0. (2 — q;o/2) is also referred to as the fractal 
dimension of the process. The parameter ao and the decay rate 7 are linked 
by Qfo = 7 — 1 . Researchers have been proposed methods in estimation of the 
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fractal dimension of the sample path of a random process based on an equally 
spaced sample [25, 26, 27, 28, 29, 30, 31]. We consider estimating oq based on 
empirical variograms constructed from the irregularly spaced data. Let 7 (/i) be 
the empirical variogram at a small lag h. From equation (5), we expect 

7(/i)4C'h“«, (6) 

and 

log 7 (h) 4 c + aolog/i, (7) 

as /i —> 0, where c = log C. In this regard, estimation of ag can be turned into a 
conventional regression problem. Let do be a least square estimate from (6) or 
a regression estimate of log 7 (h) on log h from (7), it is expected that do 4 ao: 
as h —)■ 0. 

We describe the proposed estimating procedure and the mathematical for¬ 
mulations explicitly in the rest of Section 2. 


2.1. Smoothing spline estimation of spectral density at low frequencies 


We Hrst set a grid {kA, k = 1, 2, • • • } with grid size A = tt/oJc (wc > 0) in 
the range of the observations and project the irregularly observed points onto 
the nearest grid. A reasonable choice for the cutoff value ujc is pvr, where p is 
the average sampling rate [32, 33, 34]. From the gridized observations, we can 
estimate the spectral density /a on [0, Wc]- Following HHCllb, we consider the 
spectral density function estimator belonging to a Sobolev space Wi = {g on 
[OjWc); g,g' are absolutely continuous and [g'(a;)]^da; < oo}. Consider the 
following minimization problem over the functions g in Wi , 



Since the product A(si)A(sj) is an unbiased estimator of 


C{s, 



Sj)u)f\{uj)duj, 
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the first term in (8) is small for a function g close to /a- The second term is 
a roughness penalty term with A being the smoothing parameter. Without the 
penalty term the solution to (8) is unstable and non-unique. The roughness 
penalty term stabilizes the problem to a well-posed problem. The regularized 
inverse problem (8) gives a closed form solution as 

= + (9) 

Wc no ^ nfc -I- 2[kTTyX 


where Sk = J2(s- s )GLk n^ is the number of location pairs in Lk, 

and Lk = : Si € fci7r/a;c±7r/(2u;c), G jujc±'!rj{2ujc), \ki-kj\ = k}, 

where a ± 5 is a notation for interval [a — 6, a -I- &]. To simplify the presentation, 
we refer the readers to HHCllb for derivation of the solution (9). 

A data-driven method of choosing the smoothing parameter A was discussed 
in HHCllb where a generalized cross validation approach for smoothing splines 
[35] is utilized. 

Note that based on (9), we can derive a closed-form formula for the covari¬ 
ance function estimator as 


C{h) = 


/a,a(w) cos{ujh)duj 


( 10 ) 


1 1 


K 


^ 2 cosiknuj/ujc) ^ \ / tx , 

I I -^o + — Yl — <0(1. cos{ujh)duj 

/o yuJcno uJc-^ nk+ 2[kTTyX J 


So sin(wc/i) 
Uq Loji 


K 

E 

k=l 


Sk 


f sin(A:7r -|- ujch) sin(A:7r — uich) 


nk + 2{k'KyX \ kiT + ojch 


kiT — LOrh 


We refer to (9) and (10) as HHC spectral density estimator and HHC covariance 
function estimator. It is easy to see that d?'^C{h)/dhS‘'^\h=o exists and is finite 
for any m > 0. A random field Z{s) with such covariance function is infinitely 
smoothness and is often unrealistic for physical processes. 


2.2. Estimation of the decay rate 

We consider estimating ao in (5) based on empirical variograms with small 
lags constructed from the irregularly spaced data. Let 'j{h) be empirical var- 
iogram with lag h. For irregularly located data, it is rare that the distance 
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between any pairs of observations is the same. We use tolerance regions [36]. 
For a given spatial lag /im, we define a tolerance region Tm which includes all 
pairs {si, Sj) with hm — Sm < hij = jjsi — || < hm + where Sm is a prespec¬ 

ified tolerance size with 6m/hm = o(l)- Let the empirical variogram estimate 
at lag hm be 


'^m 



E 

isi,Sj)GTj 




( 11 ) 


where Zij = [^(si) — X{sj)]^, and Nm is the number of pairs of observations in 
the tolerance region T^. After going through M prespecified small spatial lags 
hm, m = 1,. ■ ■, M, we obtain a sequence of triples {hm, Um, Nm), which stands 
for the spatial lag, empirical variogram estimate, and the number of pairs at 
this lag. The size of the tolerance region Sm affects the bias and variance of 
the empirical variogram Um- If <5m is small, the bias of Um is small, however 
the variance of Um can be large due to small sample size. If 5m is large, the 
variance of Um is small since more samples are used to construct Um, however 
the bias can be large. To see this, for an individual term Zij in (11), since 
\hij — hm\ < 5m, by Taylor expansion we have 


E[Zi,j] = l{hr,j) 

= l{hm) + O {h^-Hm) 

where the second and third equality follow from (5). Since Um is the average of 
Nm these terms, we have 


E [Um] = Chm^° + O + O {h2-^5m) ■ (12) 

Thus the bias of Um is O {h'^~^5m)- The approximated variance of the var¬ 
iogram estimate [2] is 2u^/Nm- They together explain the aforementioned 
tradeoff between the bias and variance for a given hm and determine the large 
sample properties of our proposed estimator of ao which we will see in Theorem 

1 . 
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From equation (7), we have turned estimation of ao to a conventional regres¬ 
sion problem. Let ao,OLS be a regression estimator of ao by regressing log Mm 
on log hjn, m = 1,..., M, i.e. 


0:0,OLS = 


Em^llog Urn (log/t m log^M) 

Em=l (log/lm -log/lM)^ 


(13) 


where log/i^ = M ^ log derive the asymptotic bound for the 

mean-squared error of ao^oLS s-s in Theorem 1. 


2.3. Adjusting for Aliasing and the final spectral density estimator 

Analysis based on the gridized data focus on estimation of /a(w), which is 
different from the true spectral density /(w), due to aliasing. We have obtained 
/a(w) for uj G [ 0 , ujc] and an estimated algebraic form for w G [wc, 00 ), 

where 7 = do -|- 1. We can adjust for aliasing using equation (4) to get the 
spectral density estimator 

/(w) = /a(w) - ./(w -g 2juJc) (14) 

iGo ^ / 

for u) G [0,Wc]- The parameter is the value of spectral density evaluated at 
uJc, which is chosen to guarantee that the semi-parametric estimator of spectral 
density is continuous at the cutoff point Wc. After some algebra, </> can be 
estimated by 

7 ^ fAjuJc) 

Er=-oo(i+2j)-^’ 

Thus, our final estimator of spectral density, referred to as YZ estimator, takes 
the form 

1 « / \ -7 

. UJ>UJa 
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By plugging in the form of /a(<^) and we obtain a closed form for YZ estimator 
as 


/(w) 




cos{k7ruj— cos{k7z) ^ 
nfc+2(fc7r)2A 



to > Wc 


where 



The closed-form estimator allows us to study the large sample properties of the 
proposed estimator, which is presented in Theorem 2. 

Lastly, from (15) it is possible for /(w) to have negative values. To remove 
the negativity, a practical solution is to consider 


/+(u;) = max{/(a;),0}. 


From our simulation study, we found this is not a big concern. In addition, in 
Theorem 2 we show that /(w) is consistent to /(w), so that when we have more 
samples, /(w) is guaranteed to be positive. 


3. Asymptotic Results 


Assume the following conditions: 


(C.l) Let X be an isotropic random process on M. X has the following linear 
process representation: 


X{s) 


a{s — t)dZ{t), s € M, 


where f a^(s)ds < oo, and Z has stationary independent increments with 
mean zero, the second moment E [Z (ds)]^ = ds, and the forth moment 
E [Z (ds)]^ = ^ids for /i 4 < oo. 

(C.2) Let (3{s) = sup|5|<,r/(.c;o + '^)l> some wq > 0. There exists a 

bounded, symmetric function B with B{s) decreasing for s > 0, and 
B{s) < for all large s, such that 


J \/3{u)P{u + s)\du < B{s), for all s; (16) 
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and 


A-tt klT 

sup uj~^ |/?(-h u)/3{ -h u + s)| < B{s), for all u, s. (17) 

(C.3) The covariance function C{s) = £l[X(s)X(0)] is differentiable and 
/sup| 5 |<^/^g |C(^)(s + (5)|(is < oo, where C(^^(s) = dC{s)/ds. 

(C.4) Let N be the sample size and Uk be the number of pairs of gridized data 
with spatial lag kAjcOc- There exist some C,S G (0,1), such that 


inf rifc > 6N. (18) 

k<CN 

The assumption that an isotropic random process X has a spectral density im¬ 
plies that X has the linear process representation. Thus the condition (C.l) is a 
necessary condition. We assume additionally that Z has independent increments 
to simplify the derivation. It is easy to show from (C.l) that the covariance func¬ 
tion C{s) = J a(u)a(u + s)du. Hence, (16) implies that |C'(s)| < B{s) for all s. 
The condition (C.2) then implies that X is a short-memory process. Note that 
the left hand side of (17) approximates the left hand side of (16) if w is large. 
Thus, (17) is not a strong condition given (16). The condition (C.3) requires 
the covariance function to be sufficiently smooth. The condition (C.4) guaran¬ 
tees that there are sufficiently many pairs of data associated with each small 
lag compared with the sample size. This condition is satisfied if we project the 
irregularly scattered data points into a grid with grid size less than or equal to 
1 /the average sampling rate. 

In what following, we show the asymptotic properties of our estimators. All 
proof are given in the Appendix. 


Theorem 1. Let hm ~ IV and Sm ^ N ^ such that 0 < 6 < 5' < 1, where 
the notation ~ can be read as the same order as. Let do be given by (13), then 


E 


(do — ao)^ 


= O (max (log X)-^^ . 


(19) 


The optimal rate of do is N 2 ai/( 2 Qi-i-i)^jQg 2 ^ which can be achieved when 
b = b' = l/{2ai + l). 
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Remark 1. We require 5m to be smaller than hm so that the empirical vari- 
ograms are consistent to the true variograms. Specifically, we choose 0 < 5 < 
6 ' < 1 to balance the bias term (12) and the variance of the empirical vari¬ 
ograms, respectively. The mean-squared error of do is the sum of two terms. 
The first term is the squared bias term of do due to ignoring the high order 
term in equation (5). The second term is the variance term of do, 

contributed from the variance of the empirical variogram Um- 


Remark 2. Equation (19) indicates that the convergence rate for do deteriorates 
as «! —> 0. Our simulation study (not included in the paper) shows that the 
variance of do is quite stable for all ai € (0,1], while the bias increases as 
oi —> 0 for fixed N. This is consistent with the theoretical result that the 
variance term 0{N^ ~^{logN)~'^) does not depend on ai, and the deteriorative 
rate is due to the bias term {\ogN)~^). Kent et al. (1997) discussed a 

similar problem, and proposed new estimators based on higher order difference of 
observations on grids, whose bias does not depend on ai anymore. It is possible 
to generalize their results to irregular spaced data, which we did not pursue 
in this paper. Commonly used covariance function such as the exponential 
covariance function correspond to oi = 1. Same is true for Matrn covariance 
functions with the smoothing parameter ly = m + 1/2 for some integer m. 


Theorem 2. Let /(w) be our proposed spectral density estimator (15). Under 
the conditions (C.1)-(C.4), for uj € [0, oo) and A € [iV“^,7V], we have 

max(A7-^“i,fV*''-i)'' 


bias ^/(w) 


< C- 


, (//rY , i_ 

N \N) Ur 


logN 


and 


(/(w)) 


f 1 1 

(logA^)2 ^ (logA^)2(iVA)i/4 I 


( 20 ) 


( 21 ) 


Corollary 3. Let /(w) be our proposed spectral density estimator (15). Under 
the conditions (C.1)-(C.4), for ai = 1, b = b' = 1/2, and A = /uV^, there 
exists a constant C such that for all u G [0, oo), 

, 4/51 1 


MSE(/a(cc)) < C 




w/ v^(logfV)2_ ■ 


( 22 ) 
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Remark 3. In Corollary 3, we assume ai = 1 to simplify the discussion, which 
covers many commonly used covariance models, see Remark 2. HHCllb de¬ 
rived the asymptotic bounds for the bias and variance of HHC estimator of the 
spectral density on [0,u;c]. Here we extend that to the whole real line M. The 
first term and the second term are the same as that derived in HHCllb. The 
extra term {log N)~‘^) is due to the estimation of the tail behavior. 

The implications of (22) is the following: Assume the range of the sample path 
of A(s) is [0,r], where we have N = [Twc] observations with 
then MSE(/a(w)) is bounded by which is the same with the optimal 

rate of convergence of the smoothed periodogram estimator [37, 38]. 

4. Simulation study 

In this section we assess the performance of the proposed estimator, denoted 
by YZ, with irregular spatial data in a Monte Carlo study relative to the pre¬ 
viously proposed estimators, first the smoothing spline estimator as proposed 
in HHCllb, denoted by HHC, second a parametric estimator under the Matrn 
covariance model with parameters estimated by the maximum likelihood ap¬ 
proach, denoted by Matrn . We have two Model Setups, one with a Matrn 
covariance model, and the other one with a spherical covariance model. In both 
Model Setups, the parametric estimation procedure assumes a Matrn covariance 
model. Therefore it is correctly specified in the former while it is misspecified 
in the latter. We would like to assess the robustness of our proposed semi- 
parametric estimation procedure. For the non-parametric methods, previous 
simulations have found that HHC is superior to other approaches for irregular 
data in the literature, including a procedure introduced in [32] in terms of the 
mean-squared error of estimating spectral density (HHCllb). Here we focus on 
comparisons of the proposed estimator with HHC. 

4.1. Simulation setup 

We consider the spectral density estimation of a Gaussian process on the 
real line M, whose values are observed at random locations. 
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1. 


In Model Setup One, the covariance function is a Matrn covariance func¬ 
tion 


C{h) 





and the corresponding spectral density is 


/(w) = ' 


,r(^. + i/2) / I 


2iy 




-(‘^+ 1 / 2 ) 


where <j) = l,i^ =1/2 and = 1. 

2. In Model Setup Two, the covariance function is a spherical covariance 
function 


C{h) = 


otherwise 


and the corresponding spectral density is obtained by the inverse Fourier 
transformation 

/(w) = ^ / exp{—iujh)C{h)dh, 

Stt J 

where (j) = 1, and cr^ = 1. 


In the simulation, we consider sample sizes N to be 250, 500, and 1000. The 
process is observed at N locations that are i.i.d. uniformly distributed on the 
range [0, A^]. 


4-2. Estimation 

HHC estimator is fitted on the frequency interval [0, Wc] with the cutoff fre¬ 
quency Wc = TT. The smoothing parameter A is selected by generalized cross val¬ 
idation procedure. In VZ estimation, the empirical variograms are constructed 
with lags h < iV/1000, which serve as the building blocks in the regression esti¬ 
mator d. In the parametric estimation procedure, we fit the Matrn covariance 
function and the corresponding spectral density. We evaluate the performance 
of fitting the spectral density and the covariance function by the integrated 
squared error (ISE) [39]: 

ISE(/) = / {fiio) - fiio)fcLu, 

Jo 
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and 


rlOO 

ISE(C)= / {Cih)-C{h)ydh. 

Jo 

4-3. Spatial kriging 

To compare the kriging performance based on the estimated covariance func¬ 
tion, we consider Npred = 100 equally spaced locations inside the observation 
interval for prediction. Let Zo{s) be the predicted value at location s using 
the true covariance function Cq, and Z(s) be the predicted value with an esti¬ 
mated covariance function C. The prediction errors are eo(s) = Z{s) — Zq(s), 
and e(s) = Z{s) — Z{s), respectively. Let Eg denote the expectation under the 
true covariance function Cq. Then EqCq is the mean-squared prediction error 
(MSPE) of the best linear unbiased predictor or the kriging variance. It is easy 
to show that ifoe^(s)/ifoeg(s) = 1-1- Eo{Zk{s) — Zo(s))^/ifoeg(s). The second 
term on the right hand side represents the extra mean-squared prediction error 
introduced by predicting with an estimated covariance function instead of the 
true one. We refer to this term as the increase in prediction error at location s, 
denoted by IPE(s). We conduct 100 Monte Carlo simulations and compute the 
prediction performance measure as 

mIPE = median I - Z^^\si)]'^\si = 1, .. ■, Npred, j = I, ■ ■ ■, lOoj , 

with the superscript (j) indicating that the quantity is obtained from the j-th 
Monte Carlo sample. Smaller IPE value indicates a better kriging performance 
for the corresponding covariance function estimator. 

4-4- Simulation result 

Figure 1 visualizes the performance of spectral density estimation of HHC, 
YZ, and Matrn estimator with n = 250 and 500 in two Model Setups. From 
these figures, we can see that YZ estimator is always lying below LfflC estimator 
by correcting the aliasing problem. HHC tends to overestimate the spectral 
density at higher frequencies and YZ reduces this bias by adjusting for the 
aliasing effect. When sample size increases, both HHC a,nd YZ become closer to 
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the true spectral density. In Model Setup One where the data generating model 
uses a Matrn family, Matrn estimator does a very good job in estimating the 
spectral density function. This is expected since the model is correctly specified. 
However in Model Setup Two where the data generating model uses a spherical 
function, Matrn estimator tends to away from the true spectral density. 

Figure 2 visualizes the performance of covariance function estimation of 
HHC, YZ, and Matrn estimator with n = 250 and 500 in two Model Setups. The 
covariance function estimates from HHC method exhibit oscillation even when 
sample size is increased. By expression (10), the covariance function estimates 
from HHC method are infinitely differentiable at original and is a combination of 
sin functions, which contains oscillation. Whereas, the covariance function esti¬ 
mate from our proposed method is very close to the true covariance function and 
coverages to the true covariance function when sample size increases. Among 
the three methods HHC, YZ, and Matrn, the parametric approach Matrn is the 
best given the model is correctly specified; however its performance deteriorates 
if model is misspecified. 

Table 1 presents Monte Carlo median of ISE{f), ISE{C), and mIPE for 
three methods under two Model Setups. The performance of estimating spec¬ 
tral density for HHC and YZ are comparable {ISE{f) is similar for HHC and 
YZ). However, YZ outperforms HHC in terms of estimating covariance func¬ 
tion and spatial kriging {ISEiC) and mIPE are smaller for YZ than HHC). 
By estimating the tail behavior of the spectral density, YZ gains improvement 
in Kriging prediction. Again, the parametric approach Matrn is the best given 
the model is correctly specified; however its performance deteriorates if model 
is misspecified, which suggests that the parametric approach is not robust. 

5. Discussion Remarks 

In this paper we proposed a semi-parametric method to estimate spectral 
densities of isotropic Gaussian processes observed at irregular locations on 
The methodology can be adapted for spectral density estimation of spatial pro- 
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(a) Model Setup One with n=250. 


(b) Model Setup One with n=500. 



(c) Model Setup Two with n=250. 


(d) Model Setup Two with n=500. 


Figure 1: Spectral Density Estimation in Model Setup One and Two with n=250, 500. (a): 
Model Setup One with n=250; (b): Model Setup One with n=500; (c) Model Setup Two with 
n=250; (d) Model Setup Two with n=500. The black solid line is the true spectral density 
function; the red dashed line is HHC estimator; the blue dotted line is YZ estimator; and the 
green dashed line is the maximum likelihood estimator with a covariance model in a Matrn 
family.. 
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(a) Model Setup One with n=250. 


(b) Model Setup One with n=500. 



(c) Model Setup Two with n=250. 


(d) Model Setup Two with n=500. 


Figure 2: Covariance Function Estimation in Model Setup One and Two with n=250, 500. 
(a): Model Setup One with n=250; (b): Model Setup One with n=500; (c) Model Setup 
Two with n=250; (d) Model Setup Two with n=500. The black solid line is the true spectral 
density function; the red dashed line is HHC estimator; the blue dotted line is YZ estimator; 
and the green dashed line is the maximum likelihood estimator with a covariance model in a 
Matrn family.. 
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cesses that are stationary or intrinsic random processes on with d> Such 
extension will be addressed in a separate paper. 

The proposed estimator is in a closed form, so it does not require heavy 
numerical computation. Therefore it is feasible for large-scale spatial data. It 
also allows us to derive asymptotic bounds for the bias and variance of the 
proposed estimator, and to prove the estimator is consistent in theory. 

Our method builds on HHC estimator in HHCllb. The difference is that we 
additionally estimate the spectral density at high frequencies which has been 
ignored in HHCllb. The rationale is that the tail properties of the spectral 
function play a fundamental role in the prediction. Our simulation study shows 
that the proposed estimator outperforms HHC estimator in Kriging prediction. 

This semi-parametric method allows modeling the spectral density at low 
frequencies, and therefore it is more flexible than the fully parametric approach. 

Appendix A. Proof of Theorem 1 

Since in the estimator (13), we have involved log of the empirical variogram 
estimate instead of the empirical variogram estimate itself, we first derive the 
moment property for log Um- By Taylor expansion technique, 

1 r 2l 

E [log Ujn] ^ log E[Ujn] - ^^E {Um - E [Um]) 

2{E\Ujn\) L J 

= Oolog/i™-f 0(h“i)-f O . 

The second equality follows from (12) and the approximated variance of Um- 
Combining M individual terms, 

■ M 

E E log Ujn (log - log Hm) 

_m—l 

M 

- {«ologh„ + 0(/i“i)-f O(A-I)} (logh„-log V). 

m —1 
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Therefore, the square bias term can be derived: 

„ m 2 ,J2m=l{0{h^^)+0{N-^)}{loghm-loghM), 

{-£'( 0^0 Qio)} [ / - \2 j 

Em=l (log/lm-log^M) 


0(max(7V-2'’“bAf2b'-2)(jQg^)-2^ ^ 


and the variance term is 

Urn (\ogh 

m - log V) 

Var[ao] = - 2 - 

{Em=l (log/lm - log V)^} 

Em=lE^lC'0^ (log log M;) (log fe m log hj^f) (log hi — log h]^) 

{Em=l (log/im - logV)^} 


Cov(UmiUi) 

Z^m—l 2^1 — 1 E(u-m)E(u[) 

(log - log hj^ 

u 

'log hi - log hJ^f) 

i 

f / 

^2-^m—l V 

log hm - log hj^) 
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J2^=lT,Zl — I (log - log ^m) (log hi - log hM)\ 


{Em=l (log/im - logh^)^} 

m.\\Nl\E{Um.)E{ui) 

{E1=1 (log/lm-I^M)^} 




= o[n^'-^ {logN)~‘^Y 

Combining the squared bias term and the variance term, we have 
E (do — oo)^ = O ^max (log . 

Appendix B. Proof of Theorem 2 

Write the spectral density based on the gridized data as 

/.(.) = 1 f cos(^)C(^), 

(jJc 

k— — (X) 

for uj G [0,a;c]. From the aliasing problem, we also have /a(w) = E ^-00 
2 jwc)- 
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Firstly, we consider the bias of the spectral density estimator at the cutoff 
value ujc- From (9), 

|-E[/a.a(Wc)] - /a(Wc)| 

^ ^ n, °° k-TT 

= —1^(0)+ 2^-cos(fc7r)F;[5fc] - cos(A:7r)C'( —)|(B.l) 




Tik 


k— — c 


By Taylor expansion technique, 

E[Sk] = E[Xis^)X{sJ)] 

{ti ,tj)GLk 

= C'(|s*-stl) 

{ti ,tj)GLk 

= E i^(-) + - -)} 

LiJc UJ p 

{u,tj)eLk 

where ^ij,k € kTrluJcETrjuJc- Therefore (B.l) becomes 


\E[fA,\{uJc)] - /a(Wc)| 


-|C(0) + 2E 

OJc .. 


— ^^^—cos{kTT)C{ — )- E cos{kTr)C{ — ) 
nk + k^X Wc , ^ Wc 

k —1 k— — oo 


K 


< 


fc=l k^K+l 

K 


E 


nk 


k=i 




Wc 
— a —1' 


+ c^ - 

OJr 


< c 


N 


f^r+- 

\N/ UJr 


(B.2) 


where the third term in the second last inequality follows from the third condi- 
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tion and 


^ Jik + k‘^X , , 

k—l {ti,tj)^Lk 


< _ — _ Q( — ) 

Wc nfe + k'^X^^uJc 
k—l 




\^k<ujc k>ujc 

< 0 ( 1 ). 


So the bias of (j), which is the estimator of spectral density at the cut-off value 
can be derived as 

E /a,a(wc) /max(7V“''“i, 


{e [/A,A(cCe)] - fAicOe)} + E“-oo /((I + ^j)^c) 
' E“-oo|l + 2j|-'^ 

/ max(iV-*'“biV^'-i)\ 

^ ^ ^ , , 1 , , max(fV-'>-i,iv'>'-i)| 


- /(Wc) 


since E[^] = j + O ^max (jSf (log A^) from Theorem 1 and 0 < 

Er=-ooii+2ji-"<oo. 

Finally, to derive asymptotic bound for the bias of spectral density for w € 
[OjWc], we decompose the bias into three terms as follows, 


\bias(^f{uj)^\ = \E /(w) -/(w)| 

I OO 

= I {^^[/a.a(w)] - /a(w)| + < /a(w) - E + 2ja;c) 


E E 


Iw -I- 2 jujc 


< U 1 + U 2 + US, 
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where 


Ui 


= |i?[/A,AH] - /aMI 



by the same argument in (B.2), U 2 = /a(w) — /('^ + 2jwc) = 0, and by 

(B.3) 


U, =1 + 

i5^o i#o ^ 

\ujI /u^Y 1 max(Af-*'“i,iV^'-i) 

I^'^VTV/ logiV 

Thus, we have the first part result in Theorem 2, 

|&zas(/H)| < c|^ + (^) +^ + 

Since the bias of the spectral density estimator for uj e (wcoo) is always less 
than the bias at a; = Wc, the above bound can be applied for w e (wc, 00 ). 

To derive asymptotic bound for the variance of /(w) , we first consider the 
variance of /(w) when 7 is known. Let 


max(Af-^“i, fV 
logfV 








/ \cj + 2juJa \ \ 

\ u>^ J 

Er=-ooii+2ji- 


jj=-00 I 

Write 

K K 

var (/(w|7)) = II H bk^bk^cov {Ski, Sk^), 

fcl—0 ^2—0 

where bk = (cos(fc 7 ru;/wc) — a(a;) cos(fc 7 r)) / (n^ + 2 (A: 7 r)^A). Note that for all 
(jj e [0,a;], |a(a;)| < 1, use the same derivation in HHCllb , we have 

C 


(/(w|7) 


< 


y/N\' 


Now consider variance of /(w) = f{uj\j), since by Taylor expansion, 
/(w| 7 ) = /(w| 7 ) + ^/(a;| 7 )|..y=y (7 - 7 ), 
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where 7 ' is between 7 and 7 . Let Vi = df{uj\'-f)/d^\^^j' = C'/a 7 (wc), V 2 = 

7 — 7 , when Vi and V 2 are normally distributed, we have 

Var{ViV2) = [E{Vi)]‘^Var{V2) + [E{V2)]‘^Var{Vi) 

+ 2 E{Vi)E{V 2 )Cov{Vi,V 2 ) + Var{Vi)Var{V2) + Cov{Vi,V2f 

+0 (max{N-’’°‘\N’’'-^){\ogN)-^N^^'-^^/‘^ (logiV)”^ 

+0 (loglV)”^ 

< Ci(Af'’'-i(logAf)”^) 

+ O (max(iV-'’“i,lV'’'-i)(log7V)-2iv(f''-i)/2(^;^)-i/4^ 

< O (logA^)”^) +0 (iV-'’“i(logiV)-2Af(^'-i)/2(ArA)-i/^) 

Thus 

Var (^f{uj\j)^ = Var (^f(uj\j)+ ViV2^ 

= Var{f{uj\^)) + 2 Cov{f{uj\j), V1V2) + Var{ViV2)) 

" (loglV)2 + (loglV)2(iVA)l/4j' 
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Table 1: Monte Carlo Simulation Result in Model Setup One and Two. In Model Setup One, 
the data generating model uses Matrn Covariance. In Model Setup Two, the data generating 
model uses Spherical Covariance. The estimation methods include (l)HHC: the smoothing 
spline approach as proposed in HHCllb; (2)YZ: the proposed estimator in the paper (initial of 
the authors); (3) Matrn : maximum likelihood approach with a covariance model in a Matrn 
family. The evaluation measures include (A) ISE(f):the integrated squared error of spectral 
density function estimator; (B) ISE(C): the integrated squared error of covariance function 
estimator; (C) mIPE: the Monte Carlo median of the increase in prediction error . 


Model Setup One Model Setup Two 

with Matrn Covariance with Spherical Covariance 


n 


ISE{f) 

ISE{C) 

ml PE 

ISE{f) 

ISE{C) 

ml PE 


HHC 

0.0090 

0.1201 

56.650 

0.0312 

0.6880 

89.480 

250 

YZ 

0.0097 

0.0501 

0.1633 

0.0214 

0.0875 

1.2081 


Matrn 

0.0055 

0.0226 

0.1401 

0.0388 

0.1034 

1.3944 


HHC 

0.0078 

0.0836 

3.0119 

0.0188 

0.7420 

13.5128 

500 

YZ 

0.0075 

0.0558 

0.1133 

0.0164 

0.0866 

0.9217 


Matrn 

0.0039 

0.0205 

0.0550 

0.0244 

0.0967 

0.9781 


HHC 

0.0035 

0.0827 

1.3007 

0.0103 

0.7333 

10.046 

1000 

YZ 

0.0027 

0.0242 

0.1104 

0.0091 

0.0590 

0.8001 


Matrn 

0.0019 

0.0118 

1.5e-4 

0.0245 

0.0799 

0.9123 
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